{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this project, we’ll be using the Amazon customer reviews dataset, which can be found on [Kaggle](https://www.kaggle.com/bittlingmayer/amazonreviews). The dataset contains a total of 4 million reviews, each labelled as either positive or negative sentiment. You can run the code in this article on FloydHub using their cloud GPUs by clicking the following link, which will speed up the training process significantly.\n",
    "\n",
    "[![Run on FloydHub](https://static.floydhub.com/button/button-small.svg)](https://floydhub.com/run?template=https://github.com/gabrielloye/LSTM_Sentiment-Analysis)\n",
    "\n",
    "Our goal at the end of this implementation is to create an LSTM model that can accurately classify the sentiment of a review. To do so, we’ll start with some data pre-processing, then define and train the model, and finally assess it."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For our data pre-processing steps, we'll be using *regex*, *numpy* and the *NLTK (Natural Language Toolkit)* library for some simple NLP helper functions. As the data is compressed in the *bz2* format, we'll use the Python *bz2* module to read the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
    "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[nltk_data] Downloading package punkt to /usr/share/nltk_data...\n",
      "[nltk_data]   Package punkt is already up-to-date!\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import bz2\n",
    "from collections import Counter\n",
    "import re\n",
    "import nltk\n",
    "import numpy as np\n",
    "nltk.download('punkt')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0",
    "_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a"
   },
   "outputs": [],
   "source": [
    "train_file = bz2.BZ2File('../input/amazon_reviews/train.ft.txt.bz2')\n",
    "test_file = bz2.BZ2File('../input/amazon_reviews/test.ft.txt.bz2')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_file = train_file.readlines()\n",
    "test_file = test_file.readlines()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of training reviews: 3600000\n",
      "Number of test reviews: 400000\n"
     ]
    }
   ],
   "source": [
    "print(\"Number of training reviews: \" + str(len(train_file)))\n",
    "print(\"Number of test reviews: \" + str(len(test_file)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This dataset contains a total of 4 million reviews: 3.6 million for training and 0.4 million for testing. To save time, we won't be using the entire dataset. However, if you have the computing power and capacity, go ahead and train the model on a larger portion of the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_train = 800000 #We're training on the first 800,000 reviews in the dataset\n",
    "num_test = 200000 #Using 200,000 reviews from test set\n",
    "\n",
    "train_file = [x.decode('utf-8') for x in train_file[:num_train]]\n",
    "test_file = [x.decode('utf-8') for x in test_file[:num_test]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "__label__2 Stuning even for the non-gamer: This sound track was beautiful! It paints the senery in your mind so well I would recomend it even to people who hate vid. game music! I have played the game Chrono Cross but out of all of the games I have ever played it has the best music! It backs away from crude keyboarding and takes a fresher step with grate guitars and soulful orchestras. It would impress anyone who cares to listen! ^_^\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(train_file[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we'll extract the labels from the sentences. Each line is in the format ```__label__1/2 <sentence>```, so we can split it on the first space. `__label__2` (positive sentiment) is stored as 1 and `__label__1` (negative sentiment) is stored as 0.\n",
    "\n",
    "We will also replace every digit with 0 and change all *url*s to a standard \"<url\\>\", as the exact number or url is irrelevant to the sentiment in most cases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extracting labels from sentences: __label__1 -> 0 (negative), __label__2 -> 1 (positive)\n",
    "\n",
    "train_labels = [0 if x.split(' ')[0] == '__label__1' else 1 for x in train_file]\n",
    "train_sentences = [x.split(' ', 1)[1][:-1].lower() for x in train_file]\n",
    "\n",
    "test_labels = [0 if x.split(' ')[0] == '__label__1' else 1 for x in test_file]\n",
    "test_sentences = [x.split(' ', 1)[1][:-1].lower() for x in test_file]\n",
    "\n",
    "# Some simple cleaning of data\n",
    "\n",
    "def clean_sentence(sentence):\n",
    "    # Replace every digit with 0 (raw string avoids an invalid-escape warning)\n",
    "    sentence = re.sub(r'\\d', '0', sentence)\n",
    "    # Modify URLs to <url>\n",
    "    if 'www.' in sentence or 'http:' in sentence or 'https:' in sentence or '.com' in sentence:\n",
    "        sentence = re.sub(r\"([^ ]+(?<=\\.[a-z]{3}))\", \"<url>\", sentence)\n",
    "    return sentence\n",
    "\n",
    "train_sentences = [clean_sentence(s) for s in train_sentences]\n",
    "test_sentences = [clean_sentence(s) for s in test_sentences]"
   ]
  },
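  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what the extraction and cleaning steps do to a single review, here is a minimal sketch on one made-up line. The sample text and the `demo_` names are assumptions for illustration only; they are not taken from the dataset and do not affect the variables used in the rest of the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "# Hypothetical review line in the same format as the dataset (made up for illustration)\n",
    "demo_line = '__label__2 Great phone: bought at www.example.com for 25 dollars\\n'\n",
    "\n",
    "# __label__1 -> 0 (negative), anything else -> 1 (positive)\n",
    "demo_label = 0 if demo_line.split(' ')[0] == '__label__1' else 1\n",
    "# Drop the label and the trailing newline, then lowercase\n",
    "demo_sentence = demo_line.split(' ', 1)[1][:-1].lower()\n",
    "# Replace every digit with 0\n",
    "demo_sentence = re.sub(r'\\d', '0', demo_sentence)\n",
    "# The sample contains 'www.', so the URL substitution applies: the part of a token\n",
    "# that ends in a dot followed by three lowercase letters becomes <url>\n",
    "demo_sentence = re.sub(r'([^ ]+(?<=\\.[a-z]{3}))', '<url>', demo_sentence)\n",
    "\n",
    "print(demo_label, demo_sentence)"
   ]
  },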
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "del train_file, test_file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After quickly cleaning the data, we'll tokenize the sentences, which is a standard NLP task.\n",
    "Tokenization is the task of splitting a sentence into individual tokens, which can be words, punctuation, etc.\n",
    "There are many NLP libraries that can do this, such as *spaCy* or *Scikit-learn*, but we will be using *NLTK* here as it has one of the faster tokenizers.\n",
    "\n",
    "The words will then be stored in a dictionary mapping the word to its number of appearances. These words will become our **vocabulary**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.0% done\n",
      "2.5% done\n",
      "5.0% done\n",
      "7.5% done\n",
      "10.0% done\n",
      "12.5% done\n",
      "15.0% done\n",
      "17.5% done\n",
      "20.0% done\n",
      "22.5% done\n",
      "25.0% done\n",
      "27.5% done\n",
      "30.0% done\n",
      "32.5% done\n",
      "35.0% done\n",
      "37.5% done\n",
      "40.0% done\n",
      "42.5% done\n",
      "45.0% done\n",
      "47.5% done\n",
      "50.0% done\n",
      "52.5% done\n",
      "55.0% done\n",
      "57.5% done\n",
      "60.0% done\n",
      "62.5% done\n",
      "65.0% done\n",
      "67.5% done\n",
      "70.0% done\n",
      "72.5% done\n",
      "75.0% done\n",
      "77.5% done\n",
      "80.0% done\n",
      "82.5% done\n",
      "85.0% done\n",
      "87.5% done\n",
      "90.0% done\n",
      "92.5% done\n",
      "95.0% done\n",
      "97.5% done\n",
      "100% done\n"
     ]
    }
   ],
   "source": [
    "words = Counter() #Dictionary that will map a word to the number of times it appeared in all the training sentences\n",
    "for i, sentence in enumerate(train_sentences):\n",
    "    #The sentences will be stored as a list of words/tokens\n",
    "    train_sentences[i] = []\n",
    "    for word in nltk.word_tokenize(sentence): #Tokenizing the words\n",
    "        words.update([word.lower()]) #Converting all the words to lower case\n",
    "        train_sentences[i].append(word)\n",
    "    if i%20000 == 0:\n",
    "        print(str((i*100)/num_train) + \"% done\")\n",
    "print(\"100% done\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To remove typos and words that likely don't exist, we'll remove all words from the vocab that only appear once throughout.\n",
    "To account for **unknown** words and **padding**, we'll have to add them to our vocabulary as well. Each word in the vocabulary will then be assigned an integer index and thereafter mapped to this integer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Removing the words that only appear once\n",
    "words = {k:v for k,v in words.items() if v>1}\n",
    "# Sorting the words according to the number of appearances, with the most common word being first\n",
    "words = sorted(words, key=words.get, reverse=True)\n",
    "# Adding padding and unknown to our vocabulary so that they will be assigned an index\n",
    "words = ['_PAD','_UNK'] + words\n",
    "# Dictionaries to store the word to index mappings and vice versa\n",
    "word2idx = {o:i for i,o in enumerate(words)}\n",
    "idx2word = {i:o for i,o in enumerate(words)}"
   ]
  },
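  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick illustration of the vocabulary steps above, here is a minimal sketch on a made-up three-sentence corpus. The toy sentences and the `demo_` names are assumptions for illustration only; they are kept separate so that `words`, `word2idx` and `idx2word` are left untouched."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "\n",
    "# Hypothetical, already-tokenized toy corpus (made up for illustration)\n",
    "demo_sentences = [['good', 'movie'], ['good', 'book'], ['bad', 'movie']]\n",
    "\n",
    "# Count how many times each token appears\n",
    "demo_counts = Counter()\n",
    "for demo_tokens in demo_sentences:\n",
    "    demo_counts.update(demo_tokens)\n",
    "\n",
    "# Keep tokens seen more than once, most common first, then prepend the special tokens\n",
    "demo_vocab = {k: v for k, v in demo_counts.items() if v > 1}\n",
    "demo_vocab = ['_PAD', '_UNK'] + sorted(demo_vocab, key=demo_vocab.get, reverse=True)\n",
    "demo_word2idx = {o: i for i, o in enumerate(demo_vocab)}\n",
    "\n",
    "# Tokens outside the vocabulary ('bad' here) fall back to the _UNK index\n",
    "demo_encoded = [demo_word2idx.get(w, demo_word2idx['_UNK']) for w in demo_sentences[2]]\n",
    "\n",
    "print(demo_word2idx)\n",
    "print(demo_encoded)"
   ]
  },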
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the mappings, we'll convert the words in the sentences to their corresponding indexes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i, sentence in enumerate(train_sentences):\n",
    "    # Looking up the mapping dictionary and assigning the index to the respective words\n",
    "    train_sentences[i] = [word2idx[word] if word in word2idx else word2idx['_UNK'] for word in sentence]\n",
    "\n",
    "for i, sentence in enumerate(test_sentences):\n",
    "    # For test sentences, we have to tokenize the sentences as well\n",
    "    test_sentences[i] = [word2idx[word.lower()] if word.lower() in word2idx else word2idx['_UNK'] for word in nltk.word_tokenize(sentence)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the last pre-processing step, we'll be padding the sentences with 0s and shortening the lengthy sentences so that the data can be trained in batches to speed things up."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Defining a function that either shortens sentences or pads sentences with 0 to a fixed length\n",
    "\n",
    "def pad_input(sentences, seq_len):\n",
    "    features = np.zeros((len(sentences), seq_len),dtype=int)\n",
    "    for ii, review in enumerate(sentences):\n",
    "        if len(review) != 0:\n",
    "            features[ii, -len(review):] = np.array(review)[:seq_len]\n",
    "    return features"
   ]
  },
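  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the behaviour of `pad_input` concrete, the sketch below applies the same rule to plain Python lists: short sequences are padded with 0s on the left, and long sequences keep only their first `seq_len` tokens. `left_pad_demo` is a hypothetical helper written for this illustration, not part of the pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def left_pad_demo(review, seq_len, pad_idx=0):\n",
    "    # Keep at most the first seq_len tokens, as pad_input does\n",
    "    review = review[:seq_len]\n",
    "    # Fill the remaining positions on the left with the padding index\n",
    "    return [pad_idx] * (seq_len - len(review)) + review\n",
    "\n",
    "print(left_pad_demo([5, 8, 3], 5))              # [0, 0, 5, 8, 3]\n",
    "print(left_pad_demo([5, 8, 3, 9, 4, 7, 2], 5))  # [5, 8, 3, 9, 4]"
   ]
  },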
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "seq_len = 200 #The length that the sentences will be padded/shortened to\n",
    "\n",
    "train_sentences = pad_input(train_sentences, seq_len)\n",
    "test_sentences = pad_input(test_sentences, seq_len)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Converting our labels into numpy arrays\n",
    "train_labels = np.array(train_labels)\n",
    "test_labels = np.array(test_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A padded sentence will look something like this, where 0 represents the padding: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_sentences[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our dataset is already split into *training* and *testing* data. However, we still need a set of data for validation during training. Therefore, we will split our test data by half into a validation set and a testing set. A detailed explanation on dataset splits can be found [here](https://machinelearningmastery.com/difference-test-validation-datasets/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "split_frac = 0.5\n",
    "split_id = int(split_frac * len(test_sentences))\n",
    "val_sentences, test_sentences = test_sentences[:split_id], test_sentences[split_id:]\n",
    "val_labels, test_labels = test_labels[:split_id], test_labels[split_id:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we’ll start working with the PyTorch library. We’ll first define the datasets from the sentences and labels, then load them into data loaders. We set the batch size to 400. This can be tweaked according to your needs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch.utils.data import TensorDataset, DataLoader\n",
    "\n",
    "train_data = TensorDataset(torch.from_numpy(train_sentences), torch.from_numpy(train_labels))\n",
    "val_data = TensorDataset(torch.from_numpy(val_sentences), torch.from_numpy(val_labels))\n",
    "test_data = TensorDataset(torch.from_numpy(test_sentences), torch.from_numpy(test_labels))\n",
    "\n",
    "batch_size = 400\n",
    "\n",
    "train_loader = DataLoader(train_data, shuffle=True, batch_size=batch_size)\n",
    "val_loader = DataLoader(val_data, shuffle=True, batch_size=batch_size)\n",
    "test_loader = DataLoader(test_data, shuffle=True, batch_size=batch_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also check whether a GPU is available, which can speed up training many times over. If you’re using FloydHub with a GPU to run this code, the training time will be significantly reduced.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GPU is available\n"
     ]
    }
   ],
   "source": [
    "# torch.cuda.is_available() checks and returns a Boolean True if a GPU is available, else it'll return False\n",
    "is_cuda = torch.cuda.is_available()\n",
    "\n",
    "# If we have a GPU available, we'll set our device to GPU. We'll use this device variable later in our code.\n",
    "if is_cuda:\n",
    "    device = torch.device(\"cuda\")\n",
    "    print(\"GPU is available\")\n",
    "else:\n",
    "    device = torch.device(\"cpu\")\n",
    "    print(\"GPU not available, CPU used\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([400, 200]) torch.Size([400])\n"
     ]
    }
   ],
   "source": [
    "dataiter = iter(train_loader)\n",
    "sample_x, sample_y = next(dataiter)\n",
    "\n",
    "print(sample_x.shape, sample_y.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At this point, we'll define the architecture of the model. We could create a deep neural network with a large number of LSTM layers stacked on top of each other. However, a simple model such as the one below works quite well and requires much less training time. We will be training our own word embeddings in the first layer before the sentences are fed into the LSTM layer.\n",
    "\n",
    "The final layer is a fully connected layer with a sigmoid function to classify whether the review is of positive/negative sentiment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch.nn as nn\n",
    "\n",
    "class SentimentNet(nn.Module):\n",
    "    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim, n_layers, drop_prob=0.5):\n",
    "        super(SentimentNet, self).__init__()\n",
    "        self.output_size = output_size\n",
    "        self.n_layers = n_layers\n",
    "        self.hidden_dim = hidden_dim\n",
    "        \n",
    "        self.embedding = nn.Embedding(vocab_size, embedding_dim)\n",
    "        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers, dropout=drop_prob, batch_first=True)\n",
    "        self.dropout = nn.Dropout(0.2)\n",
    "        self.fc = nn.Linear(hidden_dim, output_size)\n",
    "        self.sigmoid = nn.Sigmoid()\n",
    "        \n",
    "    def forward(self, x, hidden):\n",
    "        batch_size = x.size(0)\n",
    "        x = x.long()\n",
    "        embeds = self.embedding(x)\n",
    "        lstm_out, hidden = self.lstm(embeds, hidden)\n",
    "        lstm_out = lstm_out.contiguous().view(-1, self.hidden_dim)\n",
    "        \n",
    "        out = self.dropout(lstm_out)\n",
    "        out = self.fc(out)\n",
    "        out = self.sigmoid(out)\n",
    "        \n",
    "        out = out.view(batch_size, -1)\n",
    "        out = out[:,-1]\n",
    "        return out, hidden\n",
    "    \n",
    "    def init_hidden(self, batch_size):\n",
    "        weight = next(self.parameters()).data\n",
    "        hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().to(device),\n",
    "                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().to(device))\n",
    "        return hidden"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that we could instead load pre-trained word embeddings such as GloVe or fastText, which can increase the model’s accuracy and decrease training time.\n",
    "\n",
    "With this, we can instantiate our model after defining the arguments. The output dimension is 1, as the model only needs to output a single probability between 0 and 1, which is rounded to 0 or 1 for the final prediction. The learning rate, loss function and optimizer are defined as well.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SentimentNet(\n",
      "  (embedding): Embedding(225965, 400)\n",
      "  (lstm): LSTM(400, 512, num_layers=2, batch_first=True, dropout=0.5)\n",
      "  (dropout): Dropout(p=0.2)\n",
      "  (fc): Linear(in_features=512, out_features=1, bias=True)\n",
      "  (sigmoid): Sigmoid()\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "vocab_size = len(word2idx) + 1\n",
    "output_size = 1\n",
    "embedding_dim = 400\n",
    "hidden_dim = 512\n",
    "n_layers = 2\n",
    "\n",
    "model = SentimentNet(vocab_size, output_size, embedding_dim, hidden_dim, n_layers)\n",
    "model.to(device)\n",
    "print(model)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "lr=0.005\n",
    "criterion = nn.BCELoss()\n",
    "optimizer = torch.optim.Adam(model.parameters(), lr=lr)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we can start training the model. Every 1000 steps, we’ll evaluate the model on the validation dataset and save it if the validation loss is no higher than the previous best.\n",
    "The state_dict holds the model’s weights in PyTorch and can be loaded into a model with the same architecture at a later time or in a separate script altogether."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 1/2... Step: 1000... Loss: 0.173995... Val Loss: 0.173999\n",
      "Validation loss decreased (inf --> 0.173999).  Saving model ...\n",
      "Epoch: 1/2... Step: 2000... Loss: 0.184995... Val Loss: 0.165258\n",
      "Validation loss decreased (0.173999 --> 0.165258).  Saving model ...\n",
      "Epoch: 2/2... Step: 3000... Loss: 0.153857... Val Loss: 0.165597\n",
      "Epoch: 2/2... Step: 4000... Loss: 0.145156... Val Loss: 0.168370\n"
     ]
    }
   ],
   "source": [
    "epochs = 2\n",
    "counter = 0\n",
    "print_every = 1000\n",
    "clip = 5\n",
    "valid_loss_min = np.inf\n",
    "\n",
    "model.train()\n",
    "for i in range(epochs):\n",
    "    h = model.init_hidden(batch_size)\n",
    "    \n",
    "    for inputs, labels in train_loader:\n",
    "        counter += 1\n",
    "        h = tuple([e.data for e in h])\n",
    "        inputs, labels = inputs.to(device), labels.to(device)\n",
    "        model.zero_grad()\n",
    "        output, h = model(inputs, h)\n",
    "        loss = criterion(output.squeeze(), labels.float())\n",
    "        loss.backward()\n",
    "        nn.utils.clip_grad_norm_(model.parameters(), clip)\n",
    "        optimizer.step()\n",
    "        \n",
    "        if counter%print_every == 0:\n",
    "            val_h = model.init_hidden(batch_size)\n",
    "            val_losses = []\n",
    "            model.eval()\n",
    "            for inp, lab in val_loader:\n",
    "                val_h = tuple([each.data for each in val_h])\n",
    "                inp, lab = inp.to(device), lab.to(device)\n",
    "                out, val_h = model(inp, val_h)\n",
    "                val_loss = criterion(out.squeeze(), lab.float())\n",
    "                val_losses.append(val_loss.item())\n",
    "                \n",
    "            model.train()\n",
    "            print(\"Epoch: {}/{}...\".format(i+1, epochs),\n",
    "                  \"Step: {}...\".format(counter),\n",
    "                  \"Loss: {:.6f}...\".format(loss.item()),\n",
    "                  \"Val Loss: {:.6f}\".format(np.mean(val_losses)))\n",
    "            if np.mean(val_losses) <= valid_loss_min:\n",
    "                torch.save(model.state_dict(), './state_dict.pt')\n",
    "                print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(valid_loss_min,np.mean(val_losses)))\n",
    "                valid_loss_min = np.mean(val_losses)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After we’re done training, it's time to test our model on a dataset it has never seen before - our test dataset.\n",
    "We'll first load the model weights from the point where the validation loss is the lowest.\n",
    "We can calculate the accuracy of the model to see how accurate our model’s predictions are."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Loading the best model\n",
    "model.load_state_dict(torch.load('./state_dict.pt'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test loss: 0.161\n",
      "Test accuracy: 93.906%\n"
     ]
    }
   ],
   "source": [
    "test_losses = []\n",
    "num_correct = 0\n",
    "h = model.init_hidden(batch_size)\n",
    "\n",
    "model.eval()\n",
    "for inputs, labels in test_loader:\n",
    "    h = tuple([each.data for each in h])\n",
    "    inputs, labels = inputs.to(device), labels.to(device)\n",
    "    output, h = model(inputs, h)\n",
    "    test_loss = criterion(output.squeeze(), labels.float())\n",
    "    test_losses.append(test_loss.item())\n",
    "    pred = torch.round(output.squeeze()) #rounds the output to 0/1\n",
    "    correct_tensor = pred.eq(labels.float().view_as(pred))\n",
    "    correct = np.squeeze(correct_tensor.cpu().numpy())\n",
    "    num_correct += np.sum(correct)\n",
    "        \n",
    "print(\"Test loss: {:.3f}\".format(np.mean(test_losses)))\n",
    "test_acc = num_correct/len(test_loader.dataset)\n",
    "print(\"Test accuracy: {:.3f}%\".format(test_acc*100))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We managed to achieve an accuracy of **93.9%** with this simple LSTM model! This shows the effectiveness of LSTMs in handling such sequential tasks.\n",
    "\n",
    "This result was achieved with just a few simple layers and without any hyperparameter tuning. There are many other improvements that can be made to increase the model’s effectiveness, and you are free to attempt to beat this accuracy by implementing them!\n",
    "\n",
    "Some suggested improvements are as follows:\n",
    "- Running a hyperparameter search to optimize your configurations. A guide to the techniques can be found [here](https://blog.floydhub.com/guide-to-hyperparameters-search-for-deep-learning-models/).\n",
    "- Increasing the model complexity\n",
    "    - E.g. adding more layers or using bidirectional LSTMs\n",
    "- Using pre-trained word embeddings such as [GloVe](https://nlp.stanford.edu/projects/glove/) embeddings"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
